Designing Semantic Kernels as Implicit Superconcept Expansions

نویسندگان

  • Stephan Bloehdorn
  • Roberto Basili
  • Marco Cammisa
  • Alessandro Moschitti
چکیده

Recently, there has been an increased interest in the exploitation of background knowledge in the context of text mining tasks, especially text classification. At the same time, kernel-based learning algorithms like Support Vector Machines have become a dominant paradigm in the text mining community. Amongst other reasons, this is also due to their capability to achieve more accurate learning results by replacing standard linear kernel (bag-of-words) with customized kernel functions which incorporate additional apriori knowledge. In this paper we propose a new approach to the design of ‘semantic smoothing kernels’ by means of an implicit superconcept expansion using well-known measures of term similarity. The experimental evaluation on two different datasets indicates that our approach consistently improves performance in situations where (i) training data is scarce or (ii) the bag-ofwords representation is too sparse to build stable models when using the linear kernel.

برای دانلود رایگان متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

Efficient Linearization of Tree Kernel Functions

The combination of Support Vector Machines with very high dimensional kernels, such as string or tree kernels, suffers from two major drawbacks: first, the implicit representation of feature spaces does not allow us to understand which features actually triggered the generalization; second, the resulting computational burden may in some cases render unfeasible to use large data sets for trainin...

متن کامل

Conceptual Lexicon Using an Object-Oriented Language

This paper describes the construction of a lexicon representing abstract concepts. This lexicon is written by an object-oriented language, CTALK, and forms a dynamic network system controlled by object-oriented mechanisms. The content of the lexicon is constructed using a Japanese dictionary. First, entry words and their definition parts are derived from the dictionary. Second, syntactic and se...

متن کامل

Kernel Methods for Mining Instance Data in Ontologies

The amount of ontologies and meta data available on the Web is constantly growing. The successful application of machine learning techniques for learning of ontologies from textual data, i.e. mining for the Semantic Web, contributes to this trend. However, no principal approaches exist so far for mining from the Semantic Web. We investigate how machine learning algorithms can be made amenable f...

متن کامل

Sub-exponentially Localized Kernels and Frames Induced by Orthogonal Expansions

The aim of this paper is to construct sup-exponentially localized kernels and frames in the context of classical orthogonal expansions, namely, expansions in Jacobi polynomials, spherical harmonics, orthogonal polynomials on the ball and simplex, and Hermite and Laguerre functions.

متن کامل

Multivariable Christoffel–Darboux Kernels and Characteristic Polynomials of Random Hermitian Matrices

We study multivariable Christoffel–Darboux kernels, which may be viewed as reproducing kernels for antisymmetric orthogonal polynomials, and also as correlation functions for products of characteristic polynomials of random Hermitian matrices. Using their interpretation as reproducing kernels, we obtain simple proofs of Pfaffian and determinant formulas, as well as Schur polynomial expansions, ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006